In previous work we have illustrated the benefits that compositional data types (CDTs) offer for implementing
languages and in general for dealing with abstract syntax trees (ASTs). Based on Swierstra's
data types à la carte, CDTs are implemented as a Haskell library that enables the definition
of recursive data types and functions on them in a modular and extendable fashion. Although CDTs 
provide a powerful tool for analysing and manipulating ASTs, they lack a convenient representation 
of variable binders. In this paper we remedy this deficiency by combining the framework of CDTs 
with Chlipala's parametric higher-order abstract syntax (PHOAS). We show how a generalisation 
from functors to difunctors enables us to capture PHOAS while still maintaining the features of the 
original implementation of CDTs, in particular its modularity. Unlike previous approaches, we avoid 
so-called exotic terms without resorting to abstract types: this is crucial when we want to perform 
transformations on CDTs that inspect the recursively computed CDTs, e.g. constant folding. 

1 Introduction 

When implementing domain-specific languages (DSLs) — either as embedded languages or stand-alone 
languages — the abstract syntax trees (ASTs) of programs are usually represented as elements of a recur- 
sive algebraic data type. These ASTs typically undergo various transformation steps, such as desugaring 
from a full language to a core language. But reflecting the invariants of these transformations in the type 
system of the host language can be problematic. For instance, in order to reflect a desugaring transforma- 
tion in the type system, we must define a separate data type for ASTs of the core language. Unfortunately, 
this has the side effect that common functionality, such as pretty printing, has to be duplicated. 

Wadler identified the essence of this issue as the Expression Problem, i.e. "the goal [. . . ] to define 
a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, 
without recompiling existing code, and while retaining static type safety" [24]. Swierstra [22] elegantly
addressed this problem using Haskell and its type classes machinery. While Swierstra's approach exhibits 
invaluable simplicity and clarity, it lacks features necessary to apply it in a practical setting beyond the 
confined simplicity of the expression problem. To this end, the framework of compositional data types 
(CDTs) [4] provides a rich library for implementing practical functionality on highly modular data types.
This includes support of a wide array of recursion schemes in both pure and monadic forms, as well as 
mutually recursive data types and generalised algebraic data types (GADTs) [18]. 

What CDTs fail to address, however, is a transparent representation of variable binders that frees the 
programmer's mind from common issues like computations modulo α-equivalence and capture-avoiding
substitutions. The work we present in this paper fills that gap by adopting (a restricted form of) higher- 
order abstract syntax (HOAS) [15], which uses the host language's variable binding mechanism to rep- 
resent binders in the object language. Since implementing efficient recursion schemes in the presence of 
HOAS is challenging [8, 13, 19, 25], integrating this technique with CDTs is a non-trivial task. 

Following a brief introduction to CDTs in Section 2, we describe how to achieve this integration as
follows: 
• We adopt parametric higher-order abstract syntax (PHOAS) [6], and we show how to capture this
restricted form of HOAS via difunctors. The thus obtained parametric compositional data types
(PCDTs) allow for the definition of modular catamorphisms à la Fegaras and Sheard [8] in the
presence of binders. Unlike previous approaches, our technique does not rely on abstract types,
which is crucial for modular computations that are also modular in their result type (Section 3).

• We illustrate why monadic computations constitute a challenge in the parametric setting and we 
show how monadic catamorphisms can still be defined for a restricted class of PCDTs (Section 4).

• We show how to transfer the restricted recursion scheme of term homomorphisms [4] to PCDTs.
Term homomorphisms enable the same flexibility for reuse and opportunity for deforestation [23]
that we know from CDTs (Section 5).

• We show how to represent mutually recursive data types and GADTs by generalising PCDTs in
the style of Johann and Ghani [10] (Section 6).

• We illustrate the practical applicability of our framework by means of a complete library example, 
and we show how to automatically derive functionality for deciding equality (Section 7).

Parametric compositional data types are available as a Haskell library, including numerous examples
that are not included in this paper. All code fragments presented throughout the paper are written in
(literate) Haskell [11], and the library relies on several language extensions that are currently only known
to be supported by the Glasgow Haskell Compiler (GHC). 

2 Compositional Data Types 

Based on Swierstra's data types à la carte [22], compositional data types (CDTs) [4] provide a framework
for manipulating recursive data structures in a type-safe, modular manner. The prime application of 
CDTs is within language implementation and AST manipulation, and we present the basic concepts of 
CDTs in this section. More advanced concepts are introduced in Sections 4, 5, and 6.

2.1 Motivating Example 

Consider an extension of the lambda calculus with integers, addition, let expressions, and error signalling: 
e ::= λx.e | x | e1 e2 | n | e1 + e2 | let x = e1 in e2 | error

Our goal is to implement a pretty printer, a desugaring transformation, constant folding, and a call-by- 
value interpreter for the simple language above. The desugaring transformation will turn let expressions 
let x = e1 in e2 into (λx.e2) e1. Constant folding and evaluation will take place after desugaring, i.e. both
computations are only defined for the core language without let expressions. 

The standard approach to representing the language above is in terms of an algebraic data type: 

type Var = String 

data Exp = Lam Var Exp | Var Var | App Exp Exp | Lit Int | Plus Exp Exp | Let Var Exp Exp | Err

We may then straightforwardly define the pretty printer pretty :: Exp -> String. However, when we want
to implement the desugaring transformation, we need a new algebraic data type:

data Exp' = Lam' Var Exp' | Var' Var | App' Exp' Exp' | Lit' Int | Plus' Exp' Exp' | Err'

That is, we need to replicate all constructors of Exp, except Let, into a new type Exp' of core expressions
in order to obtain a properly typed desugaring function desug :: Exp -> Exp'. Not only does this
mean that we have to replicate the constructors, we also need to replicate common functionality, e.g. in
order to obtain a pretty printer for Exp' we must either write a new function, or write an injection function
Exp' -> Exp.

CDTs provide a solution that allows us to define the ASTs for (core) expressions without having to 
duplicate common constructors, and without having to give up on statically guaranteed invariants about 
the structure of the ASTs. CDTs take the viewpoint of data types as fixed points of functors [12], i.e. the 
definition of the AST data type is separated into non-recursive signatures (functors) on the one hand and 
the recursive structure on the other hand. For our example, we define the following signatures (omitting 
the straightforward Functor instance declarations): 

data Lam a = Lam Var a
data Var a = Var Var
data App a = App a a
data Lit a = Lit Int
data Plus a = Plus a a
data Let a = Let Var a a
data Err a = Err

Signatures can then be combined in a modular fashion by means of a formal sum of functors: 

data (f :+: g) a = Inl (f a) | Inr (g a)

instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap f (Inl x) = Inl (fmap f x)
  fmap f (Inr x) = Inr (fmap f x)

type Sig = Lam :+: Var :+: App :+: Lit :+: Plus :+: Err :+: Let

type Sig' = Lam :+: Var :+: App :+: Lit :+: Plus :+: Err

Finally, the type of terms over a (potentially compound) signature f can be constructed as the (least)
fixed point of the signature f:

data Term f = In { out :: f (Term f) }

Modulo strictness, Term Sig is isomorphic to Exp, and Term Sig' is isomorphic to Exp'.

The use of formal sums entails that each (sub)term has to be explicitly tagged with zero or more Inl 
or Inr tags. In order to add the right tags automatically, injections are derived using a type class: 

class sub :<: sup where
  inj :: sub a -> sup a
  proj :: sup a -> Maybe (sub a)

Using overlapping instance declarations, the subsignature relation :<: can be constructively defined [22].
However, due to the limitations of Haskell's type class system, instances are restricted to the form f :<: g
where f is atomic, i.e. not a sum, and g is a right-associated sum, e.g. g1 :+: (g2 :+: g3) but not (g1 :+:
g2) :+: g3. With the carefully defined instances for :<:, injection and projection functions for terms can
then be defined as follows:

inject :: (g :<: f) => g (Term f) -> Term f
inject = In . inj

project :: (g :<: f) => Term f -> Maybe (g (Term f))
project = proj . out

Additionally, in order to reduce the syntactic overhead, the CDTs library can automatically derive 
smart constructors that comprise the injections [4], e.g.

iPlus :: (Plus :<: f) => Term f -> Term f -> Term f
iPlus x y = inject (Plus x y)

Using the derived smart constructors, we can then write expressions such as let x = 2 in (λy.y + x) 3
without syntactic overhead:

e :: Term Sig
e = iLet "x" (iLit 2) ((iLam "y" (iVar "y" `iPlus` iVar "x")) `iApp` iLit 3)

In fact, the principal type of e is the open type:

(Lam :<: f, Var :<: f, App :<: f, Lit :<: f, Plus :<: f, Let :<: f) => Term f

which means that e can be used as a term over any signature containing at least these six signatures!

Next, we want to define the pretty printer, i.e. a function of type Term Sig -> String. In order to make
a recursive function definition modular too, it is defined as the catamorphism of an algebra [12]:

type Alg f a = f a -> a

cata :: Functor f => Alg f a -> Term f -> a
cata φ = φ . fmap (cata φ) . out

The advantage of this approach is that algebras can be easily combined over formal sums. A modular 
algebra definition is obtained by an open family of algebras indexed by the signature and closed under 
forming formal sums. This is achieved as a type class: 

class Pretty f where
  φPretty :: Alg f String

instance (Pretty f, Pretty g) => Pretty (f :+: g) where
  φPretty (Inl x) = φPretty x
  φPretty (Inr x) = φPretty x

pretty :: (Functor f, Pretty f) => Term f -> String
pretty = cata φPretty

The instance declaration that lifts Pretty instances to sums is crucial. Yet, the structure of its decla- 
ration is independent from the particular algebra class, and the CDTs library provides a mechanism for 
automatically deriving such instances [4]. What remains in order to implement the pretty printer is to
define instances of the Pretty algebra class for the seven signatures:

instance Pretty Lam where
  φPretty (Lam x e) = "(\\" ++ x ++ ". " ++ e ++ ")"

instance Pretty Var where
  φPretty (Var x) = x

instance Pretty App where
  φPretty (App e1 e2) = "(" ++ e1 ++ " " ++ e2 ++ ")"

instance Pretty Lit where
  φPretty (Lit n) = show n

instance Pretty Plus where
  φPretty (Plus e1 e2) = "(" ++ e1 ++ " + " ++ e2 ++ ")"

instance Pretty Let where
  φPretty (Let x e1 e2) = "(let " ++ x ++ " = " ++ e1 ++ " in " ++ e2 ++ ")"

instance Pretty Err where
  φPretty Err = "error"

With these definitions we then have that pretty e evaluates to the string (let x = 2 in ((\y. (y +
x)) 3)). Moreover, we automatically obtain a pretty printer for the core language as well, cf. the type
of pretty.
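
The pieces of this section can be put together in a small, self-contained sketch. It is only an illustration under simplifying assumptions: it uses just two signatures (Lit and Plus), ASCII names such as prettyAlg instead of the φ notation, and hand-written injections (the hypothetical helpers lit and plus) in place of the :<: class and the derived smart constructors.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Two signature functors and their formal sum.
data Lit a  = Lit Int
data Plus a = Plus a a
data (f :+: g) a = Inl (f a) | Inr (g a)
infixr 6 :+:

instance Functor Lit  where fmap _ (Lit n)    = Lit n
instance Functor Plus where fmap f (Plus x y) = Plus (f x) (f y)
instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap f (Inl x) = Inl (fmap f x)
  fmap f (Inr x) = Inr (fmap f x)

-- Terms as fixed points of a signature, and the catamorphism of an algebra.
newtype Term f = In { out :: f (Term f) }

cata :: Functor f => (f a -> a) -> Term f -> a
cata phi = phi . fmap (cata phi) . out

-- A modular algebra: one instance per signature plus one lifting instance.
class Pretty f where
  prettyAlg :: f String -> String

instance Pretty Lit  where prettyAlg (Lit n)    = show n
instance Pretty Plus where prettyAlg (Plus x y) = "(" ++ x ++ " + " ++ y ++ ")"
instance (Pretty f, Pretty g) => Pretty (f :+: g) where
  prettyAlg (Inl x) = prettyAlg x
  prettyAlg (Inr x) = prettyAlg x

pretty :: (Functor f, Pretty f) => Term f -> String
pretty = cata prettyAlg

-- Hand-written injections for one fixed signature (hypothetical helpers).
type Arith = Lit :+: Plus

lit :: Int -> Term Arith
lit = In . Inl . Lit

plus :: Term Arith -> Term Arith -> Term Arith
plus x y = In (Inr (Plus x y))

example :: String
example = pretty (lit 1 `plus` (lit 2 `plus` lit 3))
```

Adding a further signature only requires one more data type with its Functor and Pretty instances; the definitions of cata and pretty stay untouched.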

3 Parametric Compositional Data Types 

In the previous section we considered a first-order encoding of the language, which means that we have 
to be careful to ensure that computations are invariant under α-equivalence, e.g. when implementing
capture-avoiding substitutions. Higher-order abstract syntax (HOAS) [15] remedies this issue by representing
binders and variables of the object language in terms of those of the meta language.

3.1 Higher-Order Abstract Syntax 

In a standard Haskell HOAS encoding we replace the signatures Var and Lam by a revised Lam signature: 

data Lam a = Lam (a -> a)

Now, however, Lam is no longer an instance of Functor, because a occurs both in a contravariant position
and a covariant position. We therefore need to generalise functors in order to allow for negative
occurrences of the recursive parameter. Difunctors [13] provide such a generalisation:

class Difunctor f where
  dimap :: (a -> b) -> (c -> d) -> f b c -> f a d

instance Difunctor (->) where
  dimap f g h = g . h . f

instance Difunctor f => Functor (f a) where
  fmap = dimap id

A difunctor must preserve the identity function and distribute over function composition:

dimap id id = id and dimap (f . g) (h . i) = dimap g h . dimap f i

The derived Functor instance obtained by fixing the contravariant argument will hence satisfy the functor 
laws, provided that the difunctor laws are satisfied. 
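
As a sanity check, the two laws can be tested pointwise on sample inputs. The following is a minimal sketch, not a proof: the newtype wrapper Lam, the helper runLam, and the sample functions are assumptions made for illustration only.

```haskell
class Difunctor f where
  dimap :: (a -> b) -> (c -> d) -> f b c -> f a d

instance Difunctor (->) where
  dimap f g h = g . h . f

-- A HOAS-style signature, here with separate contravariant and covariant parameters.
newtype Lam a b = Lam (a -> b)

instance Difunctor Lam where
  dimap f g (Lam h) = Lam (g . h . f)

-- Hypothetical helper to observe the embedded function.
runLam :: Lam a b -> a -> b
runLam (Lam h) = h

-- Both sides of the composition law
--   dimap (f . g) (h . i) = dimap g h . dimap f i
-- with f = (+ 1), g = (* 2), h = reverse, i = show.
lhs, rhs :: Lam Int String
lhs = dimap ((+ 1) . (* 2)) (reverse . show) (Lam (* 10))
rhs = (dimap (* 2) reverse . dimap (+ 1) show) (Lam (* 10))
```

Note that the contravariant functions compose in the opposite order on the right-hand side: the input is first doubled (g) and then incremented (f) before it reaches the wrapped function.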

Meijer and Hutton [13] showed that it is possible to perform recursion over difunctor terms:

data TermMH f = InMH { outMH :: f (TermMH f) (TermMH f) }

cataMH :: Difunctor f => (f b a -> a) -> (b -> f a b) -> TermMH f -> a
cataMH φ ψ = φ . dimap (anaMH φ ψ) (cataMH φ ψ) . outMH

anaMH :: Difunctor f => (f b a -> a) -> (b -> f a b) -> b -> TermMH f
anaMH φ ψ = InMH . dimap (cataMH φ ψ) (anaMH φ ψ) . ψ

With Meijer and Hutton's approach, however, in order to lift an algebra φ :: f b a -> a to a catamorphism,
we also need to supply the inverse coalgebra ψ :: b -> f a b. That is, in order to write a pretty printer we
must supply a parser, which is not feasible, or perhaps not even possible, in practice.
Fortunately, Fegaras and Sheard [8] realised that if the embedded functions within terms are parametric,
then the inverse coalgebra is only used in order to undo computations performed by the algebra,
since parametric functions can only "push around their arguments" without examining them. The solution
proposed by Fegaras and Sheard is to add a placeholder to the structure of terms, which acts as a
right-inverse of the catamorphism:

data TermFS f a = InFS (f (TermFS f a) (TermFS f a)) | Place a

cataFS :: Difunctor f => (f a a -> a) -> TermFS f a -> a
cataFS φ (InFS t) = φ (dimap Place (cataFS φ) t)
cataFS φ (Place x) = x

We can then define e.g. a signature for lambda terms, and a function that calculates the number of bound
variables occurring in a term, as follows (the example is adopted from Washburn and Weirich [25]):

data T a b = Lam (a -> b) | App b b -- T is a difunctor, we omit the instance declaration

φ :: T Int Int -> Int
φ (Lam f) = f 1
φ (App x y) = x + y

countVar :: TermFS T Int -> Int
countVar = cataFS φ
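
For reference, the encoding above can be assembled into a runnable sketch. The Difunctor instance for T (omitted in the text), the ASCII name countAlg for the algebra, and the sample term are filled in here as assumptions.

```haskell
class Difunctor f where
  dimap :: (a -> b) -> (c -> d) -> f b c -> f a d

-- Terms with a placeholder constructor, following the definition in the text.
data TermFS f a = InFS (f (TermFS f a) (TermFS f a)) | Place a

cataFS :: Difunctor f => (f a a -> a) -> TermFS f a -> a
cataFS phi (InFS t)  = phi (dimap Place (cataFS phi) t)
cataFS _   (Place x) = x

data T a b = Lam (a -> b) | App b b

instance Difunctor T where
  dimap f g (Lam h)   = Lam (g . h . f)
  dimap _ g (App x y) = App (g x) (g y)

-- Each bound variable occurrence contributes 1.
countAlg :: T Int Int -> Int
countAlg (Lam f)   = f 1
countAlg (App x y) = x + y

countVar :: TermFS T Int -> Int
countVar = cataFS countAlg

-- The term \x. \y. (x y) x, with three variable occurrences.
example :: TermFS T Int
example = InFS (Lam (\x -> InFS (Lam (\y -> InFS (App (InFS (App x y)) x)))))
```

When the catamorphism reaches a Lam, the embedded function receives Place 1, and the later Place case hands the 1 back unchanged; this is the sense in which Place is a right-inverse of the catamorphism.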

In the TermFS encoding above, however, parametricity of the embedded functions is not guaranteed.
More specifically, the type allows for three kinds of exotic terms [25], i.e. values in the meta language
that do not correspond to terms in the object language:

badPlace :: TermFS T Bool
badPlace = Place True

badCata :: TermFS T Int
badCata = InFS (Lam (\x -> if countVar x == 0 then x else Place 0))

badCase :: TermFS T a
badCase = InFS (Lam (\x -> case x of InFS (App _ _) -> InFS (App x x); _ -> x))

Fegaras and Sheard showed how to avoid exotic terms by means of a custom type system. Washburn and
Weirich [25] later showed that exotic terms can be avoided in a Haskell encoding via type parametricity
and an abstract type of terms: terms are restricted to the type forall a . TermFS f a, and the constructors of
TermFS are hidden. Parametricity rules out badPlace and badCata, while the use of an abstract type
rules out badCase.

3.2 Parametric Higher-Order Abstract Syntax 

While the approach of Washburn and Weirich effectively rules out exotic terms in Haskell, we prefer a 
different encoding that relies on type parametricity only, and not an abstract type of terms. Our solution 
is inspired by Chlipala's parametric higher-order abstract syntax (PHOAS) [6]. PHOAS is similar to
the restricted form of HOAS that we saw above; however, Chlipala makes the parametricity explicit in 
the definition of terms by distinguishing between the type of bound variables and the type of recursive 
terms. In Chlipala's approach, an algebraic data type encoding of lambda terms LTerm can effectively be 
defined via an auxiliary data type LTrm of "preterms" as follows: 

Footnote (Section 3.1): Actually, Fegaras and Sheard do not use difunctors, but the given definition corresponds to their encoding.

type LTerm = forall a . LTrm a

data LTrm a = Lam (a -> LTrm a) | Var a | App (LTrm a) (LTrm a)

The definition of LTerm guarantees that all functions embedded via Lam are parametric, and likewise 
that Var (Fegaras and Sheard's Place) can only be applied to variables bound by an embedded function.
Atkey [2] showed that the encoding above adequately captures closed lambda terms modulo
α-equivalence, assuming that there is no infinite data and that all embedded functions are total.
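
To see how such a term is consumed, one instantiates the quantified type of bound variables at whatever type the computation needs. The following minimal sketch instantiates it at String to print a term with generated names; the function showTerm and the naming scheme x0, x1, ... are assumptions made for this illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

data LTrm a = Lam (a -> LTrm a) | Var a | App (LTrm a) (LTrm a)

type LTerm = forall a . LTrm a

-- Instantiate the variable type at String and feed each binder a fresh name.
showTerm :: LTerm -> String
showTerm t = go 0 t
  where
    go :: Int -> LTrm String -> String
    go n (Lam f)   = let x = "x" ++ show n
                     in "(\\" ++ x ++ ". " ++ go (n + 1) (f x) ++ ")"
    go _ (Var x)   = x
    go n (App s u) = "(" ++ go n s ++ " " ++ go n u ++ ")"

-- The term \x. \y. x y; it type checks at every variable type a.
example :: LTerm
example = Lam (\x -> Lam (\y -> App (Var x) (Var y)))
```

Because example has to be typeable for every a, the embedded functions cannot inspect their argument; they can only wrap it in Var.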



3.2.1 Parametric Terms 

In order to transfer Chlipala's idea to non-recursive signatures and catamorphisms, we need to distinguish 
between covariant and contravariant uses of the recursive parameter. But this is exactly what difunctors 
do! We therefore arrive at the following definition of terms over difunctors: 

newtype Term f = Term { unTerm :: forall a . Trm f a }

data Trm f a = In (f a (Trm f a)) | Var a -- "preterm"

Note the difference in Trm compared to TermFS (besides using the name Var rather than Place): the
contravariant argument to the difunctor f is not the type of terms Trm f a, but rather a parametrised type a,
which we quantify over at top-level to ensure parametricity. Hence, the only way to use a bound variable
is to wrap it in a Var constructor; it is not possible to inspect the parameter. This representation, we
believe, captures the restricted form of HOAS more faithfully than the representation of Washburn and
Weirich: in our encoding it is explicit that bound variables are merely placeholders, and not the same as
terms. Moreover, in some cases we actually need to inspect the structure of terms in order to define term
transformations; we will see such an example in Section 3.2.3. With an abstract type of terms, this is
not possible, as Washburn and Weirich note [25].

Before we define algebras and catamorphisms, we lift the ideas underlying CDTs to parametric compositional
data types (PCDTs), namely coproducts and implicit injections. Fortunately, the constructions
of Section 2 are straightforwardly generalised (the instance declarations for :<: are exactly as in data
types à la carte [22], so we omit them here):

data (f :+: g) a b = Inl (f a b) | Inr (g a b)

instance (Difunctor f, Difunctor g) => Difunctor (f :+: g) where
  dimap f g (Inl x) = Inl (dimap f g x)
  dimap f g (Inr x) = Inr (dimap f g x)

class sub :<: sup where
  inj :: sub a b -> sup a b
  proj :: sup a b -> Maybe (sub a b)

inject :: (g :<: f) => g a (Trm f a) -> Trm f a
inject = In . inj

project :: (g :<: f) => Trm f a -> Maybe (g a (Trm f a))
project (In t) = proj t
project (Var _) = Nothing



We can then recast our previous signatures from Section 2.1 as difunctors: 

data Lam a b = Lam (a -> b)
data App a b = App b b
data Lit a b = Lit Int
data Plus a b = Plus b b
data Let a b = Let b (a -> b)
data Err a b = Err

type Sig = Lam :+: App :+: Lit :+: Plus :+: Err :+: Let
type Sig' = Lam :+: App :+: Lit :+: Plus :+: Err

Finally, we can automatically derive instance declarations for Difunctor as well as smart constructor
definitions that comprise the injections as for CDTs [4]. However, in order to avoid the explicit Var
constructor, we insert dimap Var id into the declarations, e.g.

iLam :: (Lam :<: f) => (Trm f a -> Trm f a) -> Trm f a
iLam f = inject (dimap Var id (Lam f)) -- (= inject (Lam (f . Var)))

Using iLam we then need to be aware, though, that even if it takes a function Trm f a -> Trm f a as
argument, the input to that function will always be of the form Var x by construction. We can now again
represent terms such as let x = 2 in (λy.y + x) 3 compactly as follows:

e :: Term Sig
e = Term (iLet (iLit 2) (\x -> (iLam (\y -> y `iPlus` x) `iApp` iLit 3)))



3.2.2 Algebras and Catamorphisms 

Given the representation of terms as fixed points of difunctors, we can now define algebras and catamor- 
phisms: 

type Alg f a = f a a -> a

cata :: Difunctor f => Alg f a -> Term f -> a
cata φ (Term t) = cat t
  where cat (In t) = φ (fmap cat t) -- recall: fmap = dimap id
        cat (Var x) = x

The definition of cata above is essentially the same as cataFS. The only difference is that bound
variables within terms are already wrapped in a Var constructor. Therefore, the contravariant argument 
to dimap is the identity function, and we consequently use the derived function fmap instead. 
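
A minimal sketch of these definitions in action is given below. It assumes a single monolithic signature Sig (rather than a coproduct), writes dimap id directly instead of going through the derived Functor instance, and uses two illustrative algebras, countVar and size, that are not part of the text.

```haskell
{-# LANGUAGE RankNTypes #-}

class Difunctor f where
  dimap :: (a -> b) -> (c -> d) -> f b c -> f a d

data Trm f a = In (f a (Trm f a)) | Var a
newtype Term f = Term { unTerm :: forall a . Trm f a }

type Alg f a = f a a -> a

cata :: Difunctor f => Alg f a -> Term f -> a
cata phi (Term t) = cat t
  where cat (In s)  = phi (dimap id cat s)  -- only the covariant side is mapped
        cat (Var x) = x                     -- bound variables are returned as they are

-- A monolithic signature for pure lambda terms (assumption of this sketch).
data Sig a b = Lam (a -> b) | App b b

instance Difunctor Sig where
  dimap f g (Lam h)   = Lam (g . h . f)
  dimap _ g (App x y) = App (g x) (g y)

-- Number of bound variable occurrences.
countVar :: Term Sig -> Int
countVar = cata alg
  where alg :: Alg Sig Int
        alg (Lam f)   = f 1
        alg (App x y) = x + y

-- Number of nodes, counting each variable occurrence as one node.
size :: Term Sig -> Int
size = cata alg
  where alg :: Alg Sig Int
        alg (Lam f)   = 1 + f 1
        alg (App x y) = 1 + x + y

-- The term \x. x x, written without smart constructors.
selfApp :: Term Sig
selfApp = Term (In (Lam (\x -> In (App (Var x) (Var x)))))
```

Both algebras are applied to the same term, whose quantified variable type is instantiated at Int by cata in each case.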

With these definitions in place, we can now recast the modular pretty printer from Section 2.1 to 
the new difunctor signatures. However, since we now use a higher-order encoding, we need to generate 
variable names for printing. We therefore arrive at the following definition (the example is adopted from 
Washburn and Weirich [25], but we use streams rather than lists to represent the sequence of available
variable names): 

data Stream a = Cons a (Stream a)

class Pretty f where
  φPretty :: Alg f (Stream String -> String)

-- instance declaration that lifts Pretty to coproducts omitted

pretty :: (Difunctor f, Pretty f) => Term f -> String
pretty t = cata φPretty t (names 1)
  where names n = Cons ('x' : show n) (names (n + 1))

instance Pretty Lam where
  φPretty (Lam f) (Cons x xs) = "(\\" ++ x ++ ". " ++ f (const x) xs ++ ")"

instance Pretty App where
  φPretty (App e1 e2) xs = "(" ++ e1 xs ++ " " ++ e2 xs ++ ")"

instance Pretty Lit where
  φPretty (Lit n) _ = show n

instance Pretty Plus where
  φPretty (Plus e1 e2) xs = "(" ++ e1 xs ++ " + " ++ e2 xs ++ ")"

instance Pretty Let where
  φPretty (Let e1 e2) (Cons x xs) = "(let " ++ x ++ " = " ++ e1 xs
                                    ++ " in " ++ e2 (const x) xs ++ ")"

instance Pretty Err where
  φPretty Err _ = "error"

With this implementation of pretty we then have that pretty e evaluates to the string (let x1 = 2 in
((\x2. (x2 + x1)) 3)).



3.2.3 Term Transformations 

The pretty printer is an example of a modular computation over a PCDT. However, we also want to 
define computations over PCDTs that construct PCDTs, e.g. the desugaring transformation. That is, we 
want to construct functions of type Term f -> Term g, which means that we must construct functions
of type (forall a . Trm f a) -> (forall a . Trm g a). Following the approach of Section 3.2.2, we construct such
functions by forming the catamorphisms of algebras of type Alg f (forall a . Trm g a), i.e. functions of type
f (forall a . Trm g a) (forall a . Trm g a) -> forall a . Trm g a. However, in order to avoid the nested quantifiers, we
instead use parametric term algebras of type forall a . Alg f (Trm g a). From such algebras we then obtain
functions of the type forall a . (Trm f a -> Trm g a) as catamorphisms, which finally yield the desired functions
of type (forall a . Trm f a) -> (forall a . Trm g a). With these considerations in mind, we arrive at the following
definition of the desugaring algebra type class:

class Desug f g where
  φDesug :: forall a . Alg f (Trm g a) -- not Alg f (Term g)!

-- instance declaration that lifts Desug to coproducts omitted

desug :: (Difunctor f, Desug f g) => Term f -> Term g
desug t = Term (cata φDesug t)

The algebra type class above is a multi-parameter type class: it is parametrised both by the domain
signature f and the codomain signature g. We do this in order to obtain a desugaring function that is also
modular in the codomain, similar to the evaluation function for vanilla CDTs [4].

We can now define the instances of Desug for the six signatures in order to obtain the desugaring 
function. However, by utilising overlapping instances we can make do with just two instance declara- 
tions: 

instance (Difunctor f, f :<: g) => Desug f g where
  φDesug = inject . dimap Var id -- default instance for core signatures

instance (App :<: f, Lam :<: f) => Desug Let f where
  φDesug (Let e1 e2) = iLam e2 `iApp` e1

Given a term e :: Term Sig, we then have that desug e :: Term Sig', i.e. the type shows that indeed all 
syntactic sugar has been removed. 

Whereas the desugaring transformation shows that we can construct PCDTs from PCDTs in a modular
fashion, we did not make use of the fact that PCDTs can be inspected. That is, the desugaring
transformation does not inspect the recursively computed values, cf. the instance declaration for Let. 
However, in order to implement the constant folding transformation, we actually need to inspect recur- 
sively computed PCDTs. We again utilise overlapping instances: 

class Constf f g where
  φConstf :: forall a . Alg f (Trm g a)

-- instance declaration that lifts Constf to coproducts omitted

constf :: (Difunctor f, Constf f g) => Term f -> Term g
constf t = Term (cata φConstf t)

instance (Difunctor f, f :<: g) => Constf f g where
  φConstf = inject . dimap Var id -- default instance

instance (Plus :<: f, Lit :<: f) => Constf Plus f where
  φConstf (Plus e1 e2) = case (project e1, project e2) of
    (Just (Lit n), Just (Lit m)) -> iLit (n + m)
    _ -> e1 `iPlus` e2

Since we provide a default instance, we not only obtain constant folding for the core language, but also
for the full language, i.e. constf has both the types Term Sig' -> Term Sig' and Term Sig -> Term Sig.
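
The essential point, namely that the algebra pattern matches on recursively computed preterms, can be sketched without the type class machinery. The version below is a simplification under stated assumptions: it uses one monolithic signature Sig, replaces project by direct pattern matching on In, and adds a small printer render (hypothetical, with names x0, x1, ...) only to observe the result.

```haskell
{-# LANGUAGE RankNTypes #-}

class Difunctor f where
  dimap :: (a -> b) -> (c -> d) -> f b c -> f a d

data Trm f a = In (f a (Trm f a)) | Var a
newtype Term f = Term { unTerm :: forall a . Trm f a }

cata :: Difunctor f => (f a a -> a) -> Term f -> a
cata phi (Term t) = cat t
  where cat (In s)  = phi (dimap id cat s)
        cat (Var x) = x

data Sig a b = Lam (a -> b) | App b b | Lit Int | Plus b b

instance Difunctor Sig where
  dimap f g (Lam h)    = Lam (g . h . f)
  dimap _ g (App x y)  = App (g x) (g y)
  dimap _ _ (Lit n)    = Lit n
  dimap _ g (Plus x y) = Plus (g x) (g y)

-- Parametric term algebra: the already folded subterms are inspected.
constfAlg :: Sig (Trm Sig a) (Trm Sig a) -> Trm Sig a
constfAlg (Plus (In (Lit n)) (In (Lit m))) = In (Lit (n + m))
constfAlg t                                = In (dimap Var id t)  -- rebuild the node

constf :: Term Sig -> Term Sig
constf t = Term (cata constfAlg t)

-- Printer used only to observe results; the Int is the next fresh name index.
render :: Term Sig -> String
render t = cata alg t 0
  where
    alg :: Sig (Int -> String) (Int -> String) -> Int -> String
    alg (Lam f) n    = let x = "x" ++ show n
                       in "(\\" ++ x ++ ". " ++ f (const x) (n + 1) ++ ")"
    alg (App s u) n  = "(" ++ s n ++ " " ++ u n ++ ")"
    alg (Lit k) _    = show k
    alg (Plus s u) n = "(" ++ s n ++ " + " ++ u n ++ ")"

-- The term \x. (1 + 2) + x
example :: Term Sig
example = Term (In (Lam (\x ->
  In (Plus (In (Plus (In (Lit 1)) (In (Lit 2)))) (Var x)))))
```

With an abstract type of terms the first clause of constfAlg could not be written, since it matches on the constructors of the recursively computed results.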

4 Monadic Computations 

In the last section we demonstrated how to extend CDTs with parametric higher-order abstract syntax, 
and how to perform modular, recursive computations over terms containing binders. In this section we 
investigate monadic computations over PCDTs. 

4.1 Monadic Interpretation 

While the previous examples of modular computations did not require effects, the call-by-value inter- 
preter prompts the need for monadic computations: both in order to handle errors as well as controlling 
the evaluation order. Ultimately, we want to obtain a function of the type Term Sig' -> m (Sem m), where
the semantic domain Sem is defined as follows (we use an ordinary algebraic data type for simplicity):

data Sem m = Fun (Sem m -> m (Sem m)) | Int Int

Note that the monad only occurs in the codomain of Fun — if we want call-by-name semantics rather than 
call-by-value semantics, we simply add m also to the domain. 

We can now implement the modular call-by-value interpreter similar to the previous modular computations,
but using the monadic algebra carrier m (Sem m):

class Monad m => Eval m f where
  φEval :: Alg f (m (Sem m))

-- instance declaration that lifts Eval to coproducts omitted

eval :: (Difunctor f, Eval m f) => Term f -> m (Sem m)
eval = cata φEval

instance Monad m => Eval m Lam where
  φEval (Lam f) = return (Fun (f . return))

instance MonadError String m => Eval m App where
  φEval (App mx my) = do x <- mx
                         case x of Fun f -> my >>= f
                                   _ -> throwError "stuck"

instance Monad m => Eval m Lit where
  φEval (Lit n) = return (Int n)

instance MonadError String m => Eval m Plus where
  φEval (Plus mx my) = do x <- mx
                          y <- my
                          case (x, y) of (Int n, Int m) -> return (Int (n + m))
                                         _ -> throwError "stuck"

instance MonadError String m => Eval m Err where
  φEval Err = throwError "error"

In order to indicate errors in the course of the evaluation, we require the monad to provide a method 
to throw an error. To this end, we use the type class MonadError. Note how the modular design allows 
us to require the stricter constraint MonadError String m only for the cases where it is needed. This 
modularity of effects will become quite useful when we will rule out "stuck" errors in Section 6.

With the interpreter definition above we have that eval (desug e) evaluates to the value Right (Int 5)
as expected, where e is the term defined in Section 3.2.1 and m is the Either String monad. Moreover, we also have that
0 + error and 0 + λx.x evaluate to Left "error" and Left "stuck", respectively.
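
A non-modular version of this interpreter can be sketched with the base library alone. The simplifications are assumptions of this sketch: the monad is fixed to Either String (so Left replaces throwError and no MonadError class is needed), the signature Sig is monolithic, and the helper run exists only to make results comparable.

```haskell
{-# LANGUAGE RankNTypes #-}

class Difunctor f where
  dimap :: (a -> b) -> (c -> d) -> f b c -> f a d

data Trm f a = In (f a (Trm f a)) | Var a
newtype Term f = Term { unTerm :: forall a . Trm f a }

cata :: Difunctor f => (f a a -> a) -> Term f -> a
cata phi (Term t) = cat t
  where cat (In s)  = phi (dimap id cat s)
        cat (Var x) = x

data Sig a b = Lam (a -> b) | App b b | Lit Int | Plus b b | Err

instance Difunctor Sig where
  dimap f g (Lam h)    = Lam (g . h . f)
  dimap _ g (App x y)  = App (g x) (g y)
  dimap _ _ (Lit n)    = Lit n
  dimap _ g (Plus x y) = Plus (g x) (g y)
  dimap _ _ Err        = Err

type M = Either String
data Sem = Fun (Sem -> M Sem) | Int Int

-- Algebra with the monadic carrier M Sem; effects are sequenced explicitly.
evalAlg :: Sig (M Sem) (M Sem) -> M Sem
evalAlg (Lam f)      = return (Fun (f . return))
evalAlg (App mx my)  = do
  x <- mx
  case x of
    Fun f -> my >>= f
    _     -> Left "stuck"
evalAlg (Lit n)      = return (Int n)
evalAlg (Plus mx my) = do
  x <- mx
  y <- my
  case (x, y) of
    (Int n, Int m) -> return (Int (n + m))
    _              -> Left "stuck"
evalAlg Err          = Left "error"

-- Hypothetical helper: run the interpreter and keep only integer results.
run :: Term Sig -> Either String Int
run t = case cata evalAlg t of
  Right (Int n) -> Right n
  Right (Fun _) -> Left "function"
  Left msg      -> Left msg

t1, t2, t3 :: Term Sig
t1 = Term (In (App (In (Lam (\y -> In (Plus (Var y) (In (Lit 2)))))) (In (Lit 3))))
t2 = Term (In (Plus (In (Lit 0)) (In Err)))
t3 = Term (In (Plus (In (Lit 0)) (In (Lam (\x -> Var x)))))
```

Here t1 encodes (λy.y + 2) 3, t2 encodes 0 + error, and t3 encodes 0 + λx.x.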



4.2 Monadic Computations with Implicit Sequencing 

In the example above we use a monadic algebra carrier for monadic computations. For vanilla CDTs [4],
however, we have previously shown how to perform monadic computations with implicit sequencing, by
utilising the standard type class Traversable:

type AlgM m f a = f a -> m a

class Functor f => Traversable f where
  sequence :: Monad m => f (m a) -> m (f a)

cataM :: (Traversable f, Monad m) => AlgM m f a -> Term f -> m a
cataM φ = φ <=< sequence . fmap (cataM φ) . out

AlgM m f a represents the type of monadic algebras [9] over f and m, with carrier a, which is different
from Alg f (m a) since the monad only occurs in the codomain of the monadic algebra. cataM is obtained
from cata in Section 2 by performing sequence after applying fmap and replacing function composition
with monadic function composition <=<. That is, the recursion scheme takes care of sequencing the
monadic subcomputations. Monadic algebras are useful for instance if we want to recursively project a
term over a compound signature to a smaller signature:

deepProject :: (Traversable f, g :<: f) => Term f -> Maybe (Term g)
deepProject = cataM (liftM In . proj)

Moreover, in a call-by-value setting we may use a monadic algebra AlgM m f a rather than an ordinary
algebra with a monadic carrier Alg f (m a) in order to avoid the explicit sequencing of effects.
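
The effect of implicit sequencing can be seen in a small first-order sketch. It relies on the standard Traversable class from base (with a derived instance) rather than the cut-down class shown above, and the signature Sig with a Div constructor, the algebra evalAlg, and the Maybe monad are assumptions chosen for illustration.

```haskell
{-# LANGUAGE DeriveTraversable #-}

import Control.Monad ((<=<))

-- A first-order signature; Div is a hypothetical constructor that can fail.
data Sig a = Lit Int | Plus a a | Div a a
  deriving (Functor, Foldable, Traversable)

newtype Term f = In { out :: f (Term f) }

type AlgM m f a = f a -> m a

-- The recursion scheme sequences the effects of the subcomputations.
cataM :: (Traversable f, Monad m) => AlgM m f a -> Term f -> m a
cataM phi = phi <=< sequence . fmap (cataM phi) . out

-- The algebra receives plain Ints; the monad occurs only in its result.
evalAlg :: AlgM Maybe Sig Int
evalAlg (Lit n)    = Just n
evalAlg (Plus x y) = Just (x + y)
evalAlg (Div _ 0)  = Nothing
evalAlg (Div x y)  = Just (x `div` y)

eval :: Term Sig -> Maybe Int
eval = cataM evalAlg

-- 6 / (1 + 2) and 1 / 0
ok, bad :: Term Sig
ok  = In (Div (In (Lit 6)) (In (Plus (In (Lit 1)) (In (Lit 2)))))
bad = In (Div (In (Lit 1)) (In (Lit 0)))
```

Compared with an algebra over the carrier Maybe Int, the clauses for Plus and Div need no do-notation, because the arguments have already been run by cataM.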



Footnote: We have omitted methods from the definition of Traversable that are not necessary for our purposes.

Turning back to parametric terms, we can apply the same idea to difunctors, yielding the following
definition of monadic algebras:

type AlgM m f a = f a a -> m a

Similarly, we can easily generalise Traversable and cataM to difunctors:

class Difunctor f => Ditraversable f where
  disequence :: Monad m => f a (m b) -> m (f a b)

cataM :: (Ditraversable f, Monad m) => AlgM m f a -> Term f -> m a
cataM φ (Term t) = cat t
  where cat (In t) = disequence (fmap cat t) >>= φ
        cat (Var x) = return x

Unfortunately, cataM only works for difunctors that do not use the contravariant argument. To see 
why this is the case, reconsider the Lam constructor; in order to define an instance of Ditraversable for 
Lam we must write a function of the type: 

disequence :: Monad m => Lam a (m b) -> m (Lam a b)

Since Lam is isomorphic to the function type constructor (->), this is equivalent to a function of the type:

forall a b m . Monad m => (a -> m b) -> m (a -> b)

We cannot hope to be able to construct a meaningful combinator of that type. Intuitively, in a function
of type a -> m b, the monadic effect of the result can depend on the input of type a. The monadic
effect of a monadic value of type m (a -> b) is not dependent on such input. For example, think of a state
transformer monad ST with state S and its put function put :: S -> ST (). What would be the corresponding
transformation to a monadic value of type ST (S -> ())?

Hence, cataM does not extend to terms with binders, but it still works for terms without binders
as in vanilla CDTs [4]. In particular, we cannot use cataM to define the call-by-value interpreter from
Section 4.1.

5 Contexts and Term Homomorphisms 

While the generality of catamorphisms makes them a powerful tool for modular function definitions, 
their generality at the same time inhibits flexibility and reusability. However, the full generality of
catamorphisms is not always needed in the case of term transformations, which we discussed in Section 3.2.3.
To this end, we have previously studied term homomorphisms [4] as a restricted form of term algebras.
In this section we redevelop term homomorphisms for PCDTs. 

5.1 From Terms to Contexts and back 

The crucial idea behind term homomorphisms is to generalise terms to contexts, i.e. terms with holes. 
Following previous work [4] we define the generalisation of terms with holes as a generalised algebraic 
data type (GADT) [18] with phantom types Hole and NoHole: 

data Cxt :: * -> (* -> * -> *) -> * -> * -> * where
  In   :: f a (Cxt h f a b) -> Cxt h f a b
  Var  :: a -> Cxt h f a b
  Hole :: b -> Cxt Hole f a b

data Hole
data NoHole

The first argument to Cxt is a phantom type indicating whether the term contains holes or not. A
context can thus be defined as:

type Context = Cxt Hole 

That is, contexts may contain holes. On the other hand, terms must not contain holes, so we can recover 
our previous definition of preterms Trm as follows:

type Trm f a = Cxt NoHole f a ()

The definition of Term remains unchanged. This representation of contexts and preterms allows us to 
uniformly define functions that work on both types. For example, the function inject now has the type: 

inject :: (g :<: f) => g a (Cxt h f a b) -> Cxt h f a b



5.2 Term Homomorphisms 

In Section 3.2.3 we have shown that term transformations, i.e. functions of type Term f -> Term g, are
obtained as catamorphisms of parametric term algebras of type forall a. Alg f (Trm g a). Spelling out the
definition of Alg, such algebras are functions of type:

forall a. f (Trm g a) (Trm g a) -> Trm g a

As we have argued previously [4], the fact that the target signature g occurs in both the domain and
codomain in the above type prevents us from making use of the structure of the algebra's carrier type
Trm g a. In particular, the constructions that we show in Section 5.3 are not possible with the above type.

In order to circumvent this restriction, we remove the occurrences of the algebra's carrier type Trm g a 
in the domain by replacing them with type variables: 

forall a b. f a b -> Trm g a

However, since we introduce a fresh variable b, functions of the above type are not able to use the 
corresponding parts of the argument for constructing the result. A value of type b cannot be injected into 
the type Trm g a. 

This is where contexts come into the picture: we enable the use of values of type b in the result
by replacing the codomain type Trm g a with Context g a b. The result is the following type of term
homomorphisms:

type Hom f g = forall a b. f a b -> Context g a b

A function ρ :: Hom f g is a transformation of constructors from f into a context over g, i.e. a term over g
that may embed values taken from the arguments of the f-constructor. The parametric polymorphism of
the type guarantees that the arguments of the f-constructor cannot be inspected but only embedded into
the result context. In order to apply term homomorphisms to terms, we need an auxiliary function that
merges nested contexts:

appCxt :: Difunctor f => Context f a (Cxt h f a b) -> Cxt h f a b
appCxt (In t)   = In (fmap appCxt t)
appCxt (Var x)  = Var x
appCxt (Hole h) = h
Given a context that has terms embedded in its holes, we obtain a term as a result; given a context with
embedded contexts, the result is again a context. 
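As a small, self-contained sketch of how appCxt plugs the contents of holes into a context, consider the following. The binder-free signature Arith, the evaluator evalNoHole and the context outer are hypothetical names used only for illustration, and dimap id plays the role of fmap.

```haskell
{-# LANGUAGE GADTs, KindSignatures #-}
import Data.Kind (Type)

class Difunctor f where
  dimap :: (a -> b) -> (c -> d) -> f b c -> f a d

data Hole
data NoHole

data Cxt :: Type -> (Type -> Type -> Type) -> Type -> Type -> Type where
  In   :: f a (Cxt h f a b) -> Cxt h f a b
  Var  :: a -> Cxt h f a b
  Hole :: b -> Cxt Hole f a b

type Context = Cxt Hole

-- Merge a context whose holes contain contexts (or terms) into one layer.
appCxt :: Difunctor f => Context f a (Cxt h f a b) -> Cxt h f a b
appCxt (In t)   = In (dimap id appCxt t)
appCxt (Var x)  = Var x
appCxt (Hole h) = h

-- Hypothetical binder-free signature.
data Arith a b = Lit Int | Plus b b

instance Difunctor Arith where
  dimap _ _ (Lit n)    = Lit n
  dimap _ g (Plus x y) = Plus (g x) (g y)

-- Evaluator for hole-free contexts; the Hole case cannot occur at index NoHole.
evalNoHole :: Cxt NoHole Arith Int b -> Int
evalNoHole (In (Lit n))    = n
evalNoHole (In (Plus x y)) = evalNoHole x + evalNoHole y
evalNoHole (Var n)         = n

-- A context with two holes, each filled with a hole-free term.
outer :: Context Arith a (Cxt NoHole Arith a ())
outer = In (Plus (Hole (In (Lit 1))) (Hole (In (Lit 2))))
```

Here appCxt outer has the hole-free type Cxt NoHole Arith a (), so it can be passed to evalNoHole, whereas outer itself cannot.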

Using the combinator above we can now apply a term homomorphism to a preterm — or more gener- 
ally, to a context: 

appHom :: (Difunctor f, Difunctor g) => Hom f g -> Cxt h f a b -> Cxt h g a b
appHom ρ (In t)   = appCxt (ρ (fmap (appHom ρ) t))
appHom ρ (Var x)  = Var x
appHom ρ (Hole h) = Hole h

From appHom we can then obtain the actual transformation on terms as follows: 

appTHom :: (Difunctor f, Difunctor g) => Hom f g -> Term f -> Term g
appTHom ρ (Term t) = Term (appHom ρ t)

Before we describe the benefits of term homomorphisms over term algebras, we reconsider the
desugaring transformation from Section 3.2.3, but as a term homomorphism rather than a term algebra:

class Desug f g where
  ρDesug :: Hom f g

-- instance declaration that lifts Desug to coproducts omitted
desug :: (Difunctor f, Difunctor g, Desug f g) => Term f -> Term g
desug = appTHom ρDesug

instance (Difunctor f, Difunctor g, f :<: g) => Desug f g where
  ρDesug = In . fmap Hole . inj -- default instance for core signatures

instance (App :<: f, Lam :<: f) => Desug Let f where
  ρDesug (Let e1 e2) = inject (Lam (Hole . e2)) `iApp` Hole e1

Note how, in the instance declaration for Let, the constructor Hole is used to embed arguments of the
constructor Let, viz. e1 and e2, into the context that is constructed as the result.

As for the desugaring function in Section 3.2.3, we utilise overlapping instances to provide a default
translation for the signatures that need not be translated. The definitions above yield the desired
desugaring function desug :: Term Sig -> Term Sig'.



5.3 Transforming and Combining Term Homomorphisms 

In the following we shall briefly describe what we actually gain by adopting the term homomorphism
approach. First, term homomorphisms enable automatic propagation of annotations, where annotations
are added via a restricted difunctor product, namely a product of a difunctor f and a constant c:

data (f :&: c) a b = f a b :&: c

For instance, the type of ASTs of our language where each node is annotated with source positions is
captured by the type Term (Sig :&: SrcPos). With a term homomorphism Hom f g we automatically get a
lifted version Hom (f :&: c) (g :&: c), which propagates annotations from the input to the output. Hence,
from our desugaring function in the previous section we automatically get a lifted function on parse trees
Term (Sig :&: SrcPos) -> Term (Sig' :&: SrcPos), which propagates source positions from the syntactic
sugar to the core constructs. We omit the details here, but note that the constructions for CDTs [4] carry
over straightforwardly to PCDTs.
The second motivation for introducing term homomorphisms is deforestation [23]. As we have
shown previously [4], it is not possible to fuse two term algebras in order to traverse the term only once.
That is, we do not find a composition operator @ on algebras that satisfies the following equation:

cata φ1 . cata φ2 = cata (φ1 @ φ2)   for all φ1 :: Alg g a and φ2 :: forall a. Alg f (Trm g a)

With term homomorphisms, however, we do have such a composition operator @:

(@) :: (Difunctor g, Difunctor h) => Hom g h -> Hom f g -> Hom f h
ρ1 @ ρ2 = appHom ρ1 . ρ2

For this composition, we then obtain the desired equation:

appHom ρ1 . appHom ρ2 = appHom (ρ1 @ ρ2)   for all ρ1 :: Hom g h and ρ2 :: Hom f g

In fact, we can also compose an arbitrary algebra with a term homomorphism: 

(□) :: Difunctor g => Alg g a -> Hom f g -> Alg f a
φ □ ρ = free φ id . ρ

where

free :: Difunctor f => Alg f a -> (b -> a) -> Cxt h f a b -> a
free φ f (In t)   = φ (fmap (free φ f) t)
free _ _ (Var x)  = x
free _ f (Hole h) = f h

The composition of algebras and homomorphisms satisfies the following equation: 

cata φ . appHom ρ = cata (φ □ ρ)   for all φ :: Alg g a and ρ :: Hom f g

For example, in order to evaluate a term with syntactic sugar, rather than composing eval and desug,
we can use the function cata (φEval □ ρDesug), which only traverses the term once. This transformation
can be automated using GHC's rewrite mechanism [14] and our experimental results for CDTs show that
the thus obtained speedup is significant [4].
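The fusion equation for algebras and term homomorphisms can be checked on a small instance. The following is a minimal sketch under several assumptions of our own: the closed signatures Sugar and Core replace the coproduct-based signatures, compAlg is an ASCII name for the operator □, cata is expressed through free for brevity, dimap id plays the role of fmap, and Val, desugHom, evalAlg, toInt, sugared, twoPasses and onePass are hypothetical names.

```haskell
{-# LANGUAGE GADTs, KindSignatures, RankNTypes #-}
import Data.Kind (Type)

class Difunctor f where
  dimap :: (a -> b) -> (c -> d) -> f b c -> f a d

data Hole
data NoHole

data Cxt :: Type -> (Type -> Type -> Type) -> Type -> Type -> Type where
  In   :: f a (Cxt h f a b) -> Cxt h f a b
  Var  :: a -> Cxt h f a b
  Hole :: b -> Cxt Hole f a b

type Context = Cxt Hole
type Trm f a = Cxt NoHole f a ()
newtype Term f = Term { unTerm :: forall a. Trm f a }
type Alg f a = f a a -> a
type Hom f g = forall a b. f a b -> Context g a b

appCxt :: Difunctor f => Context f a (Cxt h f a b) -> Cxt h f a b
appCxt (In t)   = In (dimap id appCxt t)
appCxt (Var x)  = Var x
appCxt (Hole h) = h

appHom :: (Difunctor f, Difunctor g) => Hom f g -> Cxt h f a b -> Cxt h g a b
appHom rho (In t)   = appCxt (rho (dimap id (appHom rho) t))
appHom _   (Var x)  = Var x
appHom _   (Hole h) = Hole h

appTHom :: (Difunctor f, Difunctor g) => Hom f g -> Term f -> Term g
appTHom rho (Term t) = Term (appHom rho t)

-- Fold a context with an algebra, applying f to the contents of holes.
free :: Difunctor f => Alg f a -> (b -> a) -> Cxt h f a b -> a
free phi f (In t)   = phi (dimap id (free phi f) t)
free _   _ (Var x)  = x
free _   f (Hole h) = f h

-- Sketch-only shortcut: terms have no holes, so the hole function is never called.
cata :: Difunctor f => Alg f a -> Term f -> a
cata phi (Term t) = free phi (\() -> error "unreachable: no holes in terms") t

-- The operator written □ in the text.
compAlg :: Difunctor g => Alg g a -> Hom f g -> Alg f a
compAlg phi rho = free phi id . rho

data Sugar a b = SLam (a -> b) | SApp b b | SLit Int | SLet b (a -> b)
data Core a b  = Lam (a -> b) | App b b | Lit Int

instance Difunctor Sugar where
  dimap f g (SLam h)   = SLam (g . h . f)
  dimap _ g (SApp x y) = SApp (g x) (g y)
  dimap _ _ (SLit n)   = SLit n
  dimap f g (SLet x h) = SLet (g x) (g . h . f)

instance Difunctor Core where
  dimap f g (Lam h)   = Lam (g . h . f)
  dimap _ g (App x y) = App (g x) (g y)
  dimap _ _ (Lit n)   = Lit n

-- Desugar let-bindings into applications of lambda abstractions.
desugHom :: Hom Sugar Core
desugHom (SLam f)     = In (Lam (Hole . f))
desugHom (SApp x y)   = In (App (Hole x) (Hole y))
desugHom (SLit n)     = In (Lit n)
desugHom (SLet e1 e2) = In (App (In (Lam (Hole . e2))) (Hole e1))

data Val = I Int | F (Val -> Val)

evalAlg :: Alg Core Val
evalAlg (Lit n)   = I n
evalAlg (Lam f)   = F f
evalAlg (App g x) = case g of { F h -> h x; I _ -> error "stuck" }

toInt :: Val -> Int
toInt (I n) = n
toInt (F _) = error "not a number"

-- let x = 7 in (\y -> y) x
sugared :: Term Sugar
sugared = Term (In (SLet (In (SLit 7)) (\x -> In (SApp (In (SLam (\y -> Var y))) (Var x)))))

twoPasses, onePass :: Int
twoPasses = toInt (cata evalAlg (appTHom desugHom sugared))
onePass   = toInt (cata (compAlg evalAlg desugHom) sugared)
```

In this sketch twoPasses first builds the desugared term and then folds it, whereas onePass folds the sugared term directly with the composed algebra; both are expected to yield the same value.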

6 Generalised Parametric Compositional Data Types 

In this section we briefly describe how to lift the construction of mutually recursive data types and,
more generally, GADTs from CDTs to PCDTs. The construction is based on the work of Johann and
Ghani [10]. For CDTs the generalisation, roughly speaking, amounts to lifting functors to (generalised)
higher-order functors [10], and functions on terms to natural transformations, as shown earlier [4]:

type a :-> b = forall i. a i -> b i

class HFunctor f where
  hfmap :: (a :-> b) -> f a :-> f b

Now, signatures are of the kind (* -> *) -> * -> *, rather than * -> *, which reflects the fact that signatures
are now indexed types, and so are terms (or contexts in general). Consequently, the carrier of an algebra
is a type constructor of kind * -> *:

type Alg f a = f a :-> a

Since signatures will be defined as GADTs, we effectively deal with many-sorted algebras. If a subterm
has the type index i, then the value computed recursively by a catamorphism will have the type a i. The
coproduct :+: and the automatic injections :<: carry over straightforwardly from functors to higher-order
functors [4].

In order to lift the ideas from CDTs to PCDTs, we need to consider indexed difunctors. This prompts 
the notion of higher-order difunctors: 

class HDifunctor f where
  hdimap :: (a :-> b) -> (c :-> d) -> f b c :-> f a d

instance HDifunctor f => HFunctor (f a) where
  hfmap = hdimap id

Note the familiar pattern from ordinary PCDTs: a higher-order difunctor gives rise to a higher-order 
functor when the contravariant argument is fixed. 

To illustrate higher-order difunctors, consider a modular GADT encoding of our core language: 

data TArrow i j
data TInt

data Lam :: (* -> *) -> (* -> *) -> * -> * where
  Lam :: (a i -> b j) -> Lam a b (i `TArrow` j)

data App :: (* -> *) -> (* -> *) -> * -> * where
  App :: b (i `TArrow` j) -> b i -> App a b j

data Lit :: (* -> *) -> (* -> *) -> * -> * where
  Lit :: Int -> Lit a b TInt

data Plus :: (* -> *) -> (* -> *) -> * -> * where
  Plus :: b TInt -> b TInt -> Plus a b TInt

data Err :: (* -> *) -> (* -> *) -> * -> * where
  Err :: Err a b i

type Sig' = Lam :+: App :+: Lit :+: Plus :+: Err

Note, in particular, the type of Lam: now the bound variable is typed! 

We use TArrow and TInt as type indices for the GADT definitions above. The preference of these
fresh types over Haskell's -> and Int is meant to emphasise that these phantom types are only labels that
represent the type constructors of our object language.

We use the coproduct :+: of higher-order difunctors above to combine signatures, which is easily
defined, and as for CDTs it is straightforward to lift instances of HDifunctor for f and g to an instance
for f :+: g. Similarly, we can generalise the relation :<: from difunctors to higher-order difunctors, so we
omit its definition here.

The type of generalised parametric (pre)terms can now be constructed as an indexed type: 

newtype Term f i = Term { unTerm :: forall a. Trm f a i }
data Trm f a i = In (f a (Trm f a) i) | Var (a i)

Moreover, we use smart constructors as for PCDTs to compactly construct terms, for instance: 

e :: Term Sig' TInt
e = Term (iLam (\x -> x `iPlus` x) `iApp` iLit 2)



Finally, we can lift algebras and their induced catamorphisms by lifting the definitions in
Section 3.2.2 via natural transformations and higher-order difunctors:

type Alg f a = f a a :-> a

cata :: HDifunctor f => Alg f a -> Term f :-> a
cata φ (Term t) = cat t
  where cat (In t)  = φ (hfmap cat t) -- recall: hfmap = hdimap id
        cat (Var x) = x

With the definitions above we can now define a call-by-value interpreter for our typed example
language. To this end, we must provide a type-level function that, for a given object language type
constructed from TArrow and TInt, selects the corresponding subset of the semantic domain Sem m from
Section 4.1. This can be achieved via Haskell's type families [17]:

type family Sem (m :: * -> *) i
type instance Sem m (i `TArrow` j) = Sem m i -> m (Sem m j)
type instance Sem m TInt = Int

The type Sem m t is obtained from an object language type t by replacing each function type t1 `TArrow` t2
occurring in t with Sem m t1 -> m (Sem m t2) and each TInt with Int.

In order to make Sem into a proper type (as opposed to a mere type synonym) and simultaneously
add the monad m at the top level, we define a newtype M:

newtype M m i = M { unM :: m (Sem m i) }

class Monad m => Eval m f where
  φEval :: f (M m) (M m) i -> m (Sem m i) -- M . φEval :: Alg f (M m) is the actual algebra

eval :: (Monad m, HDifunctor f, Eval m f) => Term f i -> m (Sem m i)
eval = unM . cata (M . φEval)

We can then provide the instance declarations for the signatures of the core language, and effectively 
obtain a tagless, modular, and extendable monadic interpreter: 

instance Monad m => Eval m Lam where
  φEval (Lam f) = return (unM . f . M . return)

instance Monad m => Eval m App where
  φEval (App (M mf) (M mx)) = do f <- mf
                                 mx >>= f

instance Monad m => Eval m Lit where
  φEval (Lit n) = return n

instance Monad m => Eval m Plus where
  φEval (Plus (M mx) (M my)) = do x <- mx
                                  y <- my
                                  return (x + y)

instance MonadError String m => Eval m Err where
  φEval Err = throwError "error"
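The typed interpreter can be tried out in a self-contained form. The sketch below makes several simplifying assumptions that are ours rather than the library's: a single GADT Sig stands in for the coproduct Lam :+: App :+: Lit :+: Plus, the Err constructor and MonadError are left out so that only base is needed, the algebra is an ordinary function evalAlg instead of a method of the class Eval, and cataTrm is a hypothetical helper that carries the type signature required for polymorphic recursion over the index.

```haskell
{-# LANGUAGE GADTs, KindSignatures, RankNTypes, TypeFamilies, TypeOperators #-}
import Data.Kind (Type)
import Data.Functor.Identity (Identity (..))

-- Natural transformations between indexed types.
type a :-> b = forall i. a i -> b i

class HDifunctor f where
  hdimap :: (a :-> b) -> (c :-> d) -> f b c i -> f a d i

data TArrow i j
data TInt

-- Hypothetical closed signature standing in for the coproduct of the text.
data Sig :: (Type -> Type) -> (Type -> Type) -> Type -> Type where
  Lam  :: (a i -> b j) -> Sig a b (TArrow i j)
  App  :: b (TArrow i j) -> b i -> Sig a b j
  Lit  :: Int -> Sig a b TInt
  Plus :: b TInt -> b TInt -> Sig a b TInt

instance HDifunctor Sig where
  hdimap f g (Lam h)    = Lam (g . h . f)
  hdimap _ g (App x y)  = App (g x) (g y)
  hdimap _ _ (Lit n)    = Lit n
  hdimap _ g (Plus x y) = Plus (g x) (g y)

data Trm f a i = In (f a (Trm f a) i) | Var (a i)
newtype Term f i = Term { unTerm :: forall a. Trm f a i }

type Alg f a = f a a :-> a

-- The explicit signature permits the recursive call at a different index.
cataTrm :: HDifunctor f => Alg f a -> Trm f a i -> a i
cataTrm phi (In t)  = phi (hdimap id (cataTrm phi) t)
cataTrm _   (Var x) = x

cata :: HDifunctor f => Alg f a -> Term f i -> a i
cata phi (Term t) = cataTrm phi t

-- Semantic domain selected by the object-language type.
type family Sem (m :: Type -> Type) i
type instance Sem m (TArrow i j) = Sem m i -> m (Sem m j)
type instance Sem m TInt = Int

newtype M m i = M { unM :: m (Sem m i) }

evalAlg :: Monad m => Sig (M m) (M m) i -> m (Sem m i)
evalAlg (Lam f)              = return (unM . f . M . return)
evalAlg (App (M mf) (M mx))  = do { f <- mf; mx >>= f }
evalAlg (Lit n)              = return n
evalAlg (Plus (M mx) (M my)) = do { x <- mx; y <- my; return (x + y) }

eval :: Monad m => Term Sig i -> m (Sem m i)
eval t = unM (cata (M . evalAlg) t)

-- (\x -> x + x) 2
example :: Term Sig TInt
example = Term (In (App (In (Lam (\x -> In (Plus (Var x) (Var x))))) (In (Lit 2))))
```

Because no pattern match on a universal value type is needed, evalAlg has no "stuck" case, and example can be run in the identity monad via runIdentity (eval example).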

With the above definition of eval we have, for instance, that eval e :: Either String Int evaluates to the value
Right 4. Due to the fact that we now have a typed language, the Err constructor is the only source of 
an erroneous computation — the interpreter cannot get stuck. Moreover, since the modular specification 
of the interpreter only enforces the constraint MonadError String m for the signature Err, the term e can 
in fact be interpreted in the identity monad, rather than the Either String monad, as it does not contain 
error. Consequently, we know statically that the evaluation of e cannot fail!

Note that computations over generalised PCDTs are not limited to the tagless approach that we have 
illustrated above. We could have easily reformulated the semantic domain Sem m from Section 4.1 as a
GADT to use it as the carrier of a many-sorted algebra. Other natural carriers for many-sorted algebras
are the type families of terms Term f, of course.

Other concepts that we have introduced for vanilla PCDTs before can be transferred straightforwardly 
to generalised PCDTs in the same fashion. This includes contexts and term homomorphisms. 

7 Practical Considerations 

The motivation for introducing CDTs was to make Swierstra's data types a la carte [22] readily useful
in practice. Besides extending data types a la carte with various aspects, such as monadic computations
and term homomorphisms, the CDTs library provides all the generic functionality as well as automatic
derivation of boilerplate code. With (generalised) PCDTs we have followed that path. Our library
provides Template Haskell [20] code to automatically derive instances of the required type classes, such as
Difunctor and Ditraversable, as well as smart constructors and lifting of algebra type classes to
coproducts. Moreover, our library supports automatic derivation of standard type classes Show, Eq, and Ord for
terms, similar to Haskell's deriving mechanism. We show how to derive instances of Eq in the following
subsection. Ord follows in the same fashion, and Show follows an approach similar to the pretty printer
in Section 3.2.2, but using the monad FreshM that is also used to determine equality, as we shall see
below.

Figure 1 provides the complete source code needed to implement our example language from
Section 2.1. Note that we have derived Show, Eq, and Ord instances for terms of the language. In particular,
the term e is printed as Let (Lit 2) (\a -> App (Lam (\b -> Plus b a)) (Lit 3)).

7.1 Equality 

A common pattern when programming in Haskell is to derive instances of the type class Eq, for
instance in order to test the desugaring transformation in Section 3.2.3. While the use of PHOAS ensures
that all functions are invariant under α-renaming, we still have to devise an algorithm that decides
α-equivalence. To this end, we will turn the rather elusive representation of bound variables via functions
into a concrete form.

In order to obtain concrete representations of bound variables, we provide a method for generating 
fresh variable names. This is achieved via a monad FreshM offering the following operations: 

withName :: (Name -> FreshM a) -> FreshM a
evalFreshM :: FreshM a -> a

FreshM is an abstraction of an infinite sequence of fresh names. The function withName provides a fresh 
name. Names are represented by the abstract type Name, which implements instances of Show, Eq, and 
Ord. 

We first introduce a variant of the type class Eq that uses the FreshM monad: 

class PEq a where
  peq :: a -> a -> FreshM Bool

This type class is used to define the type class EqD of equatable difunctors, which lifts to coproducts: 
class EqD f where
  eqD :: PEq a => f Name a -> f Name a -> FreshM Bool

instance (EqD f, EqD g) => EqD (f :+: g) where
  eqD (Inl x) (Inl y) = x `eqD` y
  eqD (Inr x) (Inr y) = x `eqD` y
  eqD _ _ = return False

We then obtain equality of terms as follows (we do not consider contexts here for simplicity): 

instance EqD f => PEq (Trm f Name) where
  peq (In t1) (In t2)   = t1 `eqD` t2
  peq (Var x1) (Var x2) = return (x1 == x2)
  peq _ _ = return False

instance (Difunctor f, EqD f) => Eq (Term f) where
  (==) (Term x) (Term y) = evalFreshM ((x :: Trm f Name) `peq` y)

Note that we need to explicitly instantiate the parametric type in x to Name in the last instance declaration,
in order to trigger the instance for Trm f Name defined above.

Equality of terms, i.e. α-equivalence, has thus been reduced to providing instances of EqD for the
difunctors comprising the signature of the term, which for Lam can be defined as follows:

instance EqD Lam where
  eqD (Lam f) (Lam g) = withName (\x -> f x `peq` g x)

That is, f and g are considered equal if they are equal when applied to the same fresh name x.
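The whole construction fits into a short, self-contained sketch. The text only specifies the interface of FreshM, so the implementation below (a function from the next unused index) is an assumption made for illustration, as are the representation of Name, the closed signature Sig used instead of a coproduct, the helper eqTrm (which avoids the type annotation on x), and the example terms t1, t2 and t3.

```haskell
{-# LANGUAGE RankNTypes, FlexibleInstances #-}

newtype Name = Name Int deriving (Eq, Ord, Show)

-- Assumed implementation of FreshM: a computation reading the next unused index.
newtype FreshM a = FreshM { runFreshM :: Int -> a }

instance Functor FreshM where
  fmap f (FreshM g) = FreshM (f . g)

instance Applicative FreshM where
  pure x = FreshM (const x)
  FreshM f <*> FreshM g = FreshM (\n -> f n (g n))

instance Monad FreshM where
  FreshM g >>= k = FreshM (\n -> runFreshM (k (g n)) n)

-- Hand out name n and continue with n + 1 inside the scope of the binder.
withName :: (Name -> FreshM a) -> FreshM a
withName k = FreshM (\n -> runFreshM (k (Name n)) (n + 1))

evalFreshM :: FreshM a -> a
evalFreshM m = runFreshM m 0

data Trm f a = In (f a (Trm f a)) | Var a
newtype Term f = Term { unTerm :: forall a. Trm f a }

class PEq a where
  peq :: a -> a -> FreshM Bool

class EqD f where
  eqD :: PEq a => f Name a -> f Name a -> FreshM Bool

instance EqD f => PEq (Trm f Name) where
  peq (In t1) (In t2)   = t1 `eqD` t2
  peq (Var x1) (Var x2) = return (x1 == x2)
  peq _ _               = return False

-- Hypothetical helper that fixes the parametric type to Name.
eqTrm :: EqD f => Trm f Name -> Trm f Name -> Bool
eqTrm x y = evalFreshM (x `peq` y)

instance EqD f => Eq (Term f) where
  Term x == Term y = eqTrm x y

-- Hypothetical closed signature with a binder.
data Sig a b = Lam (a -> b) | App b b | Lit Int

instance EqD Sig where
  eqD (Lam f) (Lam g)         = withName (\x -> f x `peq` g x)
  eqD (App f1 x1) (App f2 x2) = do
    r <- f1 `peq` f2
    if r then x1 `peq` x2 else return False
  eqD (Lit n) (Lit m)         = return (n == m)
  eqD _ _                     = return False

t1, t2, t3 :: Term Sig
t1 = Term (In (Lam (\x -> In (Lam (\y -> In (App (Var x) (Var y)))))))
t2 = Term (In (Lam (\u -> In (Lam (\v -> In (App (Var u) (Var v)))))))
t3 = Term (In (Lam (\x -> In (Lam (\y -> In (App (Var y) (Var x)))))))
```

With these definitions t1 and t2 compare equal, since both binders receive the same fresh names, while t1 and t3 differ in the order of the applied variables.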

8 Discussion and Related Work 

Implementing languages with binders can be a difficult task. Using explicit variable names, we have to
be careful in order to make sure that functions on ASTs are invariant under α-renaming. HOAS [15] is
one way of tackling this problem, by reusing the binding mechanisms of the implementation language to 
define those of the object language. The challenge with HOAS, however, is that it is difficult to perform 
recursive computations over ASTs with binders [8, 13, 19, 25]. Besides what is documented in this paper,
we have also lifted (generalised) parametric compositional data types to other (co)recursion schemes, 
such as anamorphisms and histomorphisms. Moreover, term homomorphisms can be straightforwardly 
extended with a state space: depending on how the state is propagated, this yields bottom-up resp. top- 
down tree transducers Q. 

Our approach of using PHOAS [6] amounts to the same restriction on embedded functions as Fegaras
and Sheard [8], and Washburn and Weirich [25]. However, unlike Washburn and Weirich's Haskell
implementation, our approach does not rely on making the type of terms abstract. Not only is it interesting
to see that we can do without type abstraction, in fact, we sometimes need to inspect terms in order to
write functions that produce terms, such as our constant folding algorithm. With Washburn and Weirich's
encoding this is not possible.

Ahn and Sheard [1] recently showed how to generalise the recursion schemes of Washburn and
Weirich to Mendler-style recursion schemes, using the same representation for terms as Washburn and
Weirich. Hence their approach also suffers from the inability to inspect terms. Although we could easily
adopt Mendler-style recursion schemes in our setting, their generality does not make a difference in a
non-strict language such as Haskell. Additionally, Ahn and Sheard pose the open question whether there
is a safe (i.e., terminating) way to apply histomorphisms to terms with negative recursive occurrences:
although we have not investigated termination properties of our histomorphisms, we conjecture that the
use of our parametric terms, which are purely inductive, may provide one solution.

The finally tagless approach of Carette et al. [5] has been proposed as an alternative solution to
the expression problem [24]. While the approach is very simple and elegant, and also supports (typed)
higher-order encodings, the approach falls short when we want to define recursive, modular computations
that construct modular terms too. Atkey et al. [2], for instance, use the finally tagless approach to build a
modular interpreter. However, the interpreter cannot be made modular in the return type, i.e. the language 
defining values. Hence, when Atkey et al. extend their expression language they need to also change the 
data type that represents values, which means that the approach is not fully modular. Although our
interpreter in Section 4.1 also uses a fixed domain of values Sem, we can make the interpreter fully
modular by also using a PCDT for the return type, and using a multi-parameter type class definition
similar to the desugaring transformation in Section 3.2.3.



Nominal sets [16] is another approach for dealing with binders, in which variables are explicit, but
recursively defined functions are guaranteed to be invariant with respect to α-equivalence of terms.
Implementations of this approach, however, require extensions of the metalanguage [21], and the approach
is therefore not immediately usable in Haskell.



import Data.Comp.Param
import Data.Comp.Param.Show ()
import Data.Comp.Param.Equality ()
import Data.Comp.Param.Ordering ()
import Data.Comp.Param.Derive
import Control.Monad.Error (MonadError, throwError)

data Lam a b  = Lam (a -> b)
data App a b  = App b b
data Lit a b  = Lit Int
data Plus a b = Plus b b
data Let a b  = Let b (a -> b)
data Err a b  = Err

$(derive [smartConstructors, makeDifunctor, makeShowD, makeEqD, makeOrdD]
         [''Lam, ''App, ''Lit, ''Plus, ''Let, ''Err])

e :: Term (Lam :+: App :+: Lit :+: Plus :+: Let :+: Err)
e = Term (iLet (iLit 2) (\x -> (iLam (\y -> y `iPlus` x) `iApp` iLit 3)))

-- * Desugaring
class Desug f g where
  desugHom :: Hom f g

$(derive [liftSum] [''Desug]) -- lift Desug to coproducts

desug :: (Difunctor f, Difunctor g, Desug f g) => Term f -> Term g
desug (Term t) = Term (appHom desugHom t)

instance (Difunctor f, Difunctor g, f :<: g) => Desug f g where
  desugHom = In . fmap Hole . inj -- default instance for core signatures

instance (App :<: f, Lam :<: f) => Desug Let f where
  desugHom (Let e1 e2) = inject (Lam (Hole . e2)) `iApp` Hole e1

-- * Constant folding
class Constf f g where
  constfAlg :: forall a. Alg f (Trm g a)

$(derive [liftSum] [''Constf]) -- lift Constf to coproducts

constf :: (Difunctor f, Constf f g) => Term f -> Term g
constf t = Term (cata constfAlg t)

instance (Difunctor f, f :<: g) => Constf f g where
  constfAlg = inject . dimap Var id -- default instance

instance (Plus :<: f, Lit :<: f) => Constf Plus f where
  constfAlg (Plus e1 e2) = case (project e1, project e2) of
                             (Just (Lit n), Just (Lit m)) -> iLit (n + m)
                             _                            -> e1 `iPlus` e2

-- * Call-by-value evaluation
data Sem m = Fun (Sem m -> m (Sem m)) | Int Int

class Monad m => Eval m f where
  evalAlg :: Alg f (m (Sem m))

$(derive [liftSum] [''Eval]) -- lift Eval to coproducts

eval :: (Difunctor f, Eval m f) => Term f -> m (Sem m)
eval = cata evalAlg

instance Monad m => Eval m Lam where
  evalAlg (Lam f) = return (Fun (f . return))

instance MonadError String m => Eval m App where
  evalAlg (App mx my) = do x <- mx
                           case x of Fun f -> my >>= f
                                     _     -> throwError "stuck"

instance Monad m => Eval m Lit where
  evalAlg (Lit n) = return (Int n)

instance MonadError String m => Eval m Plus where
  evalAlg (Plus mx my) = do x <- mx
                            y <- my
                            case (x, y) of (Int n, Int m) -> return (Int (n + m))
                                           _              -> throwError "stuck"

instance MonadError String m => Eval m Err where
  evalAlg Err = throwError "error"

Figure 1: Complete example using the parametric compositional data types library.



